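Welcome to Lesson 7, where we introduce transfer learning. This technique reuses a deep learning model that has already been trained on a large, general dataset (such as ImageNet) and adapts it to a new, specific task (such as our FoodVision challenge). When labeled data is limited, it is essential for reaching state-of-the-art results efficiently.
1. The Power of Pre-trained Weights
Deep neural networks learn features hierarchically. Lower layers learn basic concepts (edges, corners, textures), while deeper layers combine these into complex concepts (eyes, tires, specific objects). The key insight is that the basic features learned early on are broadly applicable across most visual domains.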
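Components of Transfer Learning
- Source task: training on a large, general dataset, for example ImageNet (about 14 million images in the full dataset; the widely used 1000-class subset contains roughly 1.2 million).
- Target task: adapting the weights to classify a much smaller dataset (for example, our specific FoodVision classes).
- Reused part: the vast majority of the network's parameters, namely the feature extraction layers, are reused directly.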
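The three components above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the lesson's actual model: `feature_extractor` is a small hypothetical stand-in for a pre-trained backbone (a real workflow would load pre-trained weights instead), and the 16-feature width and 7 output classes are assumed values chosen only for the example.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pre-trained backbone; a real workflow would
# load pre-trained weights rather than use randomly initialized layers.
feature_extractor = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),  # output shape: (batch, 16)
)

# Freeze the reused part so its weights are not updated during training.
for param in feature_extractor.parameters():
    param.requires_grad = False

# New trainable head for the target task (7 classes is an assumed example).
classifier_head = nn.Linear(16, 7)

model = nn.Sequential(feature_extractor, classifier_head)

# Only parameters that still require gradients are handed to the optimizer.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=0.01)

logits = model(torch.randn(4, 3, 32, 32))
print(logits.shape)  # torch.Size([4, 7])
```

Passing only the unfrozen parameters to the optimizer is one common choice; it makes explicit that the backbone stays fixed while the head learns.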
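Efficiency Gains
Transfer learning significantly lowers two major resource barriers: compute cost (you avoid spending days training the entire model) and data requirements (a few hundred training samples, rather than thousands, can be enough to reach high accuracy).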
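To make the compute saving concrete, the sketch below counts frozen versus trainable parameters. The `backbone` here is a hypothetical toy network with made-up layer sizes, so the absolute numbers are illustrative only; the point is the ratio between the two counts.

```python
from torch import nn

# Hypothetical stand-in backbone and head; sizes are illustrative only.
backbone = nn.Sequential(
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
)
head = nn.Linear(512, 7)

# Freeze the backbone so only the head is trained.
for param in backbone.parameters():
    param.requires_grad = False

model = nn.Sequential(backbone, head)

# Count parameters by whether they will receive gradient updates.
frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"frozen: {frozen}, trainable: {trainable}")
# frozen: 787456, trainable: 3591
```

Even in this toy setup, fewer than 1% of the parameters are updated during training; with a real pre-trained backbone the gap is typically far larger.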
Question 1
What is the primary advantage of using a model pre-trained on ImageNet for a new vision task?
Question 2
In a Transfer Learning workflow, which part of the neural network is typically frozen?
Question 3
When replacing the classifier head in PyTorch, what parameter must you first determine from the frozen base?
Challenge: Adapting the Classifier Head
Designing a new classifier for FoodVision.
You load a ResNet model pre-trained on ImageNet. Its last feature layer outputs a vector of size 512. Your 'FoodVision' project has 7 distinct food classes.
Step 1
What is the required Input Feature size for the new, trainable Linear Layer?
Solution:
The Input Feature size must match the output of the frozen base layer.
Size: 512.
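If the feature size is not known in advance, one way to find it is to pass a dummy input through the frozen base and read the shape of the result. The sketch below assumes a hypothetical `frozen_base` that stands in for the pre-trained ResNet body; the 64x64 input size is arbitrary. With torchvision's ResNet models, the same number is usually available as `model.fc.in_features`.

```python
import torch
from torch import nn

# Hypothetical frozen base standing in for the pre-trained ResNet body.
frozen_base = nn.Sequential(
    nn.Conv2d(3, 512, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
frozen_base.eval()

# Pass one dummy image through the base and read the feature dimension.
with torch.no_grad():
    features = frozen_base(torch.zeros(1, 3, 64, 64))

in_features = features.shape[1]
print(in_features)  # 512
```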
Step 2
What is the PyTorch code snippet to create this new classification layer (assuming the output is named `new_layer`)?
Solution:
The frozen base's output size (512) is the new layer's input size, and the class count (7) is its output size.
Code:
new_layer = torch.nn.Linear(512, 7)
Step 3
What is the required Output Feature size for the new Linear Layer?
Solution:
The Output Feature size must match the number of target classes.
Size: 7.
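As a quick sanity check on the three steps, the sketch below builds the layer and runs a dummy batch through it. The batch size of 8 and the random `features` tensor are assumptions standing in for real outputs of the frozen base.

```python
import torch

# The new classification layer from Step 2.
new_layer = torch.nn.Linear(512, 7)

# A dummy batch of 8 feature vectors, shaped like the frozen base's output.
features = torch.randn(8, 512)
logits = new_layer(features)

print(new_layer.in_features, new_layer.out_features)  # 512 7
print(logits.shape)  # torch.Size([8, 7])
```

Each row of `logits` holds one score per food class; applying a softmax over the last dimension would turn them into class probabilities.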